


1


TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG
AGCGCGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC


51 


GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG 
CTCTGCCAGT GTCGAACAGA CATTCGCCTA CGGCCCTCGT CTGTTCGGGC


101 


TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG
AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC








HindIII


151 


CGGCATCAGA GCAGATTGTA CTGAGAGTGC ACCATATGAA GCTTTTTGCA
GCCGTAGTCT CGTCTAACAT GACTCTCACG TGGTATACTT CGAAAAACGT


201 


AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA CTACTTCTGG AATAGCTCAG 
TTTCGGATCC GGAGGTTTTT TCGGAGGAGT GATGAAGACC TTATCGAGTC 


251 


AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA
TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT


301 


TGGGGCGGAG AATGGGCGGA ACTGGGCGGG GAGGGAATTA TTGGCTATTG
ACCCCGCCTC TTACCCGCCT TGACCCGCCC CTCCCTTAAT AACCGATAAC


351 


GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT TATATTGGCT
CGGTAACGTA TGCAACATAG ATATAGTATT ATACATGTAA ATATAACCGA


401 


CATGTCCAAT ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 
GTACAGGTTA TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT 


451 


TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 
ATCATTAGTT AATGCCCCAG TAATCAAGTA TCGGGTATAT ACCTCAAGGC 


501 


CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 
GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG 


551 


CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 
GGGCGGGTAA CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT 


601 


GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 
CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA TTTGACGGGT 


651 


CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC CCTATTGACG 
GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC 


701 


TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA
AGTTACTGCC ATTTACCGGG CGGACCGTAA TACGGGTCAT GTACTGGAAT


751 


CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC
GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG


801 


CATGGTGATG CGGTTTTGGC AGTACACCAA TGGGCGTGGA TAGCGGTTTG
GTACCACTAC GCCAAAACCG TCATGTGGTT ACCCGCACCT ATCGCCAAAC


851 


ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG
TGAGTGCCCC TAAAGGTTCA GAGGTGGGGT AACTGCAGTT ACCCTCAAAC
901


TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ATAACCCCGC
AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGGGCG






951 


CCCGTTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 
GGGCAACTGC GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT






1001 


GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 
CGTCTCGAGC AAATCACTTG GCAGTCTAGC GGACCTCTGC GGTAGGTGCG 






1051 


TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG
ACAAAACTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC






1101 


GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA
CCTTGCCACG TAACCTTGCG CCTAAGGGGC ACGGTTCTCA CTGCATTCAT






1151 


CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 
GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGAGAATA CGTACGATAT






1201 


CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA
GACAAAAACC GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT






1251 


TGGTATAGCT TAGCCTATAG GTGTGGGTTA TTGACCATTA TTGACCACTC
ACCATATCGA ATCGGATATC CACACCCAAT AACTGGTAAT AACTGGTGAG






1301 


CCCTATTGGT GACGATACTT TCCATTACTA ATCCATAACA TGGCTCTTTG
GGGATAACCA CTGCTATGAA AGGTAATGAT TAGGTATTGT ACCGAGAAAC






1351 


CCACAACTAT CTCTATTGGC TATATGCCAA TACTCTGTCC TTCAGAGACT
GGTGTTGATA GAGATAACCG ATATACGGTT ATGAGACAGG AAGTCTCTGA






1401 


GACACGGACT CTGTATTTTT ACAGGATGGG GTCCATTTAT TATTTACAAA
CTGTGCCTGA GACATAAAAA TGTCCTACCC CAGGTAAATA ATAAATGTTT






1451 


TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA 
AAGTGTATAT GTTGTTGCGG CAGGGGGCAC GGGCGTCAAA AATAATTTGT 






1501 


TAGCGTGGGA TCTCCGACAT CTCGGGTACG TGTTCCGGAC ATGGGCTCTT 
ATCGCACCCT AGAGGCTGTA GAGCCCATGC ACAAGGCCTG TACCCGAGAA 






1551 


CTCCGGTAGC GGCGGAGCTT CCACATCCGA GCCCTGGTCC CATCCGTCCA 
GAGGCCATCG CCGCCTCGAA GGTGTAGGCT CGGGACCAGG GTAGGCAGGT 






1601 


GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT GGAGGCCAGA 
CGCCGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA CCTCCGGTCT 






1651 


CTTAGGCACA GCACAATGCC CACCACCACC AGTGTGCCGC ACAAGGCCGT
GAATCCGTGT CGTGTTACGG GTGGTGGTGG TCACACGGCG TGTTCCGGCA






1701 


GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGAGATTGG GCTCGCACCT
CCGCCATCCC ATACACAGAC TTTTACTCGA GCCTCTAACC CGAGCGTGGA






1751 


GGACGCAGAT GGAAGACTTA AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT
CCTGCGTCTA CCTTCTGAAT TCCGTCGCCG TCTTCTTCTA CGTCCGTCGA







1801 GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT TGCGGTGCTG 
CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA ACGCCACGAC 
1851 TTAACGGTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG
AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC 



1901 CGCCACCAGA CATAATAGCT GACAGACTAA CAGACTGTTC CTTTCCATGG 
GCGGTGGTCT GTATTATCGA CTGTCTGATT GTCTGACAAG GAAAGGTACC 



SalI EcoRI XhoI



1951 GTCTTTTCTG CAGTCACCGT CGTCGACCTA AGAATTCAGA CTCGAGCAAG 
CAGAAAAGAC GTCAGTGGCA GCAGCTGGAT TCTTAAGTCT GAGCTCGTTC 



XbaI AscI EcoRV BamHI MluI

2001 TCTAGAAAGG CGCGCCAAGA TATCAAGGAT CCACTACGCG TTAGAGCTCG 
AGATCTTTCC GCGCGGTTCT ATAGTTCCTA GGTGATGCGC AATCTCGAGC 



2051 CTGATCAGCC TCGACTGTGC CTTCTAGTTG CCAGCCATCT GTTGTTTGCC 
GACTAGTCGG AGCTGACACG GAAGATCAAC GGTCGGTAGA CAACAAACGG 



2101 CCTCCCCCGT GCCTTCCTTG ACCCTGGAAG GTGCCACTCC CACTGTCCTT
GGAGGGGGCA CGGAAGGAAC TGGGACCTTC CACGGTGAGG GTGACAGGAA 



2151 TCCTAATAAA ATGAGGAAAT TGCATCGCAT TGTCTGAGTA GGTGTCATTC 
AGGATTATTT TACTCCTTTA ACGTAGCGTA ACAGACTCAT CCACAGTAAG 



2201 TATTCTGGGG GGTGGGGTGG GGCAGGACAG CAAGGGGGAG GATTGGGAAG 
ATAAGACCCC CCACCCCACC CCGTCCTGTC GTTCCCCCTC CTAACCCTTC 



2251 ACAATAGCAG GCATGCTGGG GAGCTCTTCC GCTTCCTCGC TCACTGACTC 
TGTTATCGTC CGTACGACCC CTCGAGAAGG CGAAGGAGCG AGTGACTGAG 



2301 GCTGCGCTCG GTCGTTCGGC TGCGGCGAGC GGTATCAGCT CACTCAAAGG 
CGACGCGAGC CAGCAAGCCG ACGCCGCTCG CCATAGTCGA GTGAGTTTCC 



2351 CGGTAATACG GTTATCCACA GAATCAGGGG ATAACGCAGG AAAGAACATG 
GCCATTATGC CAATAGGTGT CTTAGTCCCC TATTGCGTCC TTTCTTGTAC 



2401 TGAGCAAAAG GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG CCGCGTTGCT 
ACTCGTTTTC CGGTCGTTTT CCGGTCCTTG GCATTTTTCC GGCGCAACGA



2451 GGCGTTTTTC CATAGGCTCC GCCCCCCTGA CGAGCATCAC AAAAATCGAC 
CCGCAAAAAG GTATCCGAGG CGGGGGGACT GCTCGTAGTG TTTTTAGCTG 



2501 GCTCAAGTCA GAGGTGGCGA AACCCGACAG GACTATAAAG ATACCAGGCG 
CGAGTTCAGT CTCCACCGCT TTGGGCTGTC CTGATATTTC TATGGTCCGC 



2551 TTTCCCCCTG GAAGCTCCCT CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT 
AAAGGGGGAC CTTCGAGGGA GCACGCGAGA GGACAAGGCT GGGACGGCGA 



2601 TACCGGATAC CTGTCCGCCT TTCTCCCTTC GGGAAGCGTG GCGCTTTCTC 
ATGGCCTATG GACAGGCGGA AAGAGGGAAG CCCTTCGCAC CGCGAAAGAG 



2651 AATGCTCACG CTGTAGGTAT CTCAGTTCGG TGTAGGTCGT TCGCTCCAAG 
TTACGAGTGC GACATCCATA GAGTCAAGCC ACATCCAGCA AGCGAGGTTC 



2701 CTGGGCTGTG TGCACGAACC CCCCGTTCAG CCCGACCGCT GCGCCTTATC 
GACCCGACAC ACGTGCTTGG GGGGCAAGTC GGGCTGGCGA CGCGGAATAG
2751 CGGTAACTAT CGTCTTGAGT CCAACCCGGT AAGACACGAC TTATCGCCAC
GCCATTGATA GCAGAACTCA GGTTGGGCCA TTCTGTGCTG AATAGCGGTG 



2801 TGGCAGCAGC CACTGGTAAC AGGATTAGCA GAGCGAGGTA TGTAGGCGGT 
ACCGTCGTCG GTGACCATTG TCCTAATCGT CTCGCTCCAT ACATCCGCCA



2851 GCTACAGAGT TCTTGAAGTG GTGGCCTAAC TACGGCTACA CTAGAAGGAC 
CGATGTCTCA AGAACTTCAC CACCGGATTG ATGCCGATGT GATCTTCCTG 



2901 AGTATTTGGT ATCTGCGCTC TGCTGAAGCC AGTTACCTTC GGAAAAAGAG 
TCATAAACCA TAGACGCGAG ACGACTTCGG TCAATGGAAG CCTTTTTCTC 



2951 TTGGTAGCTC TTGATCCGGC AAACAAACCA CCGCTGGTAG CGGTGGTTTT 
AACCATCGAG AACTAGGCCG TTTGTTTGGT GGCGACCATC GCCACCAAAA 



3001 TTTGTTTGCA AGCAGCAGAT TACGCGCAGA AAAAAAGGAT CTCAAGAAGA 
AAACAAACGT TCGTCGTCTA ATGCGCGTCT TTTTTTCCTA GAGTTCTTCT 



3051 TCCTTTGATC TTTTCTACGG GGTCTGACGC TCAGTGGAAC GAAAACTCAC 
AGGAAACTAG AAAAGATGCC CCAGACTGCG AGTCACCTTG CTTTTGAGTG 



3101 GTTAAGGGAT TTTGGTCATG AGATTATCAA AAAGGATCTT CACCTAGATC 
CAATTCCCTA AAACCAGTAC TCTAATAGTT TTTCCTAGAA GTGGATCTAG 



3151 CTTTTAAATT AAAAATGAAG TTTTAAATCA ATCTAAAGTA TATATGAGTA 
GAAAATTTAA TTTTTACTTC AAAATTTAGT TAGATTTCAT ATATACTCAT 



3201 AACTTGGTCT GACAGTTACC AATGCTTAAT CAGTGAGGCA CCTATCTCAG 
TTGAACCAGA CTGTCAATGG TTACGAATTA GTCACTCCGT GGATAGAGTC 



3251 CGATCTGTCT ATTTCGTTCA TCCATAGTTG CCTGACTCCC CGTCGTGTAG 
GCTAGACAGA TAAAGCAAGT AGGTATCAAC GGACTGAGGG GCAGCACATC 



3301 ATAACTACGA TACGGGAGGG CTTACCATCT GGCCCCAGTG CTGCAATGAT 
TATTGATGCT ATGCCCTCCC GAATGGTAGA CCGGGGTCAC GACGTTACTA 



3351 ACCGCGAGAC CCACGCTCAC CGGCTCCAGA TTTATCAGCA ATAAACCAGC 
TGGCGCTCTG GGTGCGAGTG GCCGAGGTCT AAATAGTCGT TATTTGGTCG 



3401 CAGCCGGAAG GGCCGAGCGC AGAAGTGGTC CTGCAACTTT ATCCGCCTCC 
GTCGGCCTTC CCGGCTGGCG TCTTCACCAG GACGTTGAAA TAGGCGGAGG 



3451 ATCCAGTCTA TTAATTGTTG CCGGGAAGCT AGAGTAAGTA GTTCGCCAGT 
TAGGTCAGAT AATTAACAAC GGCCCTTCGA TCTCATTCAT CAAGCGGTCA 



3501 TAATAGTTTG CGCAACGTTG TTGCCATTGC TACAGGCATC GTGGTGTCAC 
ATTATCAAAC GCGTTGCAAC AACGGTAACG ATGTCCGTAG CACCACAGTG 



3551 GCTCGTCGTT TGGTATGGCT TCATTCAGCT CCGGTTCCCA ACGATCAAGG 
CGAGCAGCAA ACCATACCGA AGTAAGTCGA GGCCAAGGGT TGCTAGTTCC 



3601 CGAGTTACAT GATCCCCCAT GTTGTGCAAA AAAGCGGTTA GCTCCTTCGG 
GCTCAATGTA CTAGGGGGTA CAACACGTTT TTTCGCCAAT CGAGGAAGCC 



3651 TCCTCCGATC GTTGTCAGAA GTAAGTTGGC CGCAGTGTTA TCACTCATGG 
AGGAGGCTAG CAACAGTCTT CATTCAACCG GCGTCACAAT AGTGAGTACC 
3701 TTATGGCAGC ACTGCATAAT TCTCTTACTG TCATGCCATC CGTAAGATGC
AATACCGTCG TGACGTATTA AGAGAATGAC AGTACGGTAG GCATTCTACG 



3751 TTTTCTGTGA CTGGTGAGTA CTCAACCAAG TCATTCTGAG AATAGTGTAT 
AAAAGACACT GACCACTCAT GAGTTGGTTC AGTAAGACTC TTATCACATA



3801 GCGGCGACCG AGTTGCTCTT GCCCGGCGTC AATACGGGAT AATACCGCGC 
CGCCGCTGGC TCAACGAGAA CGGGCCGCAG TTATGCCCTA TTATGGCGCG 



3851 CACATAGCAG AACTTTAAAA GTGCTCATCA TTGGAAAACG TTCTTCGGGG 
GTGTATCGTC TTGAAATTTT CACGAGTAGT AACCTTTTGC AAGAAGCCCC 



3901 CGAAAACTCT CAAGGATCTT ACCGCTGTTG AGATCCAGTT CGATGTAACC 
GCTTTTGAGA GTTCCTAGAA TGGCGACAAC TCTAGGTCAA GCTACATTGG 



3951 CACTCGTGCA CCCAACTGAT CTTCAGCATC TTTTACTTTC ACCAGCGTTT
GTGAGCACGT GGGTTGACTA GAAGTCGTAG AAAATGAAAG TGGTCGCAAA 



4001 CTGGGTGAGC AAAAACAGGA AGGCAAAATG CCGCAAAAAA GGGAATAAGG 
GACCCACTCG TTTTTGTCCT TCCGTTTTAC GGCGTTTTTT CCCTTATTCC 



4051 GCGACACGGA AATGTTGAAT ACTCATACTC TTCCTTTTTC AATATTATTG 
CGCTGTGCCT TTACAACTTA TGAGTATGAG AAGGAAAAAG TTATAATAAC 



4101 AAGCATTTAT CAGGGTTATT GTCTCATGAG CGGATACATA TTTGAATGTA 
TTCGTAAATA GTCCCAATAA CAGAGTACTC GCCTATGTAT AAACTTACAT 



4151 TTTAGAAAAA TAAACAAATA GGGGTTCCGC GCACATTTCC CCGAAAAGTG 
AAATCTTTTT ATTTGTTTAT CCCCAAGGCG CGTGTAAAGG GGCTTTTCAC 



4201 CCACCTGACG TCTAAGAAAC CATTATTATC ATGACATTAA CCTATAAAAA 
GGTGGACTGC AGATTCTTTG GTAATAATAG TACTGTAATT GGATATTTTT 



4251 TAGGCGTATC ACGAGGCCCT TTCGTC 
ATCCGCATAG TGCTCCGGGA AAGCAG 
1 TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG
AGCGCGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC 



51 GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG 
CTCTGCCAGT GTCGAACAGA CATTCGCCTA CGGCCCTCGT CTGTTCGGGC 



101 TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG
AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC



HindIII



151 CGGCATCAGA GCAGATTGTA CTGAGAGTGC ACCATATGAA GCTTTTTGCA 
GCCGTAGTCT CGTCTAACAT GACTCTCACG TGGTATACTT CGAAAAACGT 



StuI 



AatI



201 AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA CTACTTCTGG AATAGCTCAG 
TTTCGGATCC GGAGGTTTTT TCGGAGGAGT GATGAAGACC TTATCGAGTC 



SfiI

251 AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA 
TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT 



301 TGGGGCGGAG AATGGGCGGA ACTGGGCGGG GAGGGAATTA TTGGCTATTG 
ACCCCGCCTC TTACCCGCCT TGACCCGCCC CTCCCTTAAT AACCGATAAC 



351 GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT TATATTGGCT 
CGGTAACGTA TGCAACATAG ATATAGTATT ATACATGTAA ATATAACCGA 



401 CATGTCCAAT ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 
GTACAGGTTA TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT 



451 TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 
ATCATTAGTT AATGCCCCAG TAATCAAGTA TCGGGTATAT ACCTCAAGGC 



501 CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 
GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG 



551 CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 
GGGCGGGTAA CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT 



601 GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 
CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA TTTGACGGGT 



651 CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC CCTATTGACG 
GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC 



701 TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 
AGTTACTGCC ATTTACCGGG CGGACCGTAA TACGGGTCAT GTACTGGAAT 



SnaBI 



751 CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 
GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG 
801 CATGGTGATG CGGTTTTGGC AGTACACCAA TGGGCGTGGA TAGCGGTTTG
GTACCACTAC GCCAAAACCG TCATGTGGTT ACCCGCACCT ATCGCCAAAC 



851 ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 
TGAGTGCCCC TAAAGGTTCA GAGGTGGGGT AACTGCAGTT ACCCTCAAAC 



901 TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ATAACCCCGC
AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGGGCG



951 CCCGTTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 
GGGCAACTGC GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT 

1001 GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 
CGTCTCGAGC AAATCACTTG GCAGTCTAGC GGACCTCTGC GGTAGGTGCG 



XmaIII

SacII 

KspI

EclXI 
EagI 



1051 TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 
ACAAAACTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC 



1101 GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA 
CCTTGCCACG TAACCTTGCG CCTAAGGGGC ACGGTTCTCA CTGCATTCAT 



Ppu10I
NsiI



1151 CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 
GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGAGAATA CGTACGATAT 



1201 CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA 
GACAAAAACC GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT 



EspI
CelII
Bpu1102I



1251 TGGTATAGCT TAGCCTATAG GTGTGGGTTA TTGACCATTA TTGACCACTC 
ACCATATCGA ATCGGATATC CACACCCAAT AACTGGTAAT AACTGGTGAG 

1301 CCCTATTGGT GACGATACTT TCCATTACTA ATCCATAACA TGGCTCTTTG 
GGGATAACCA CTGCTATGAA AGGTAATGAT TAGGTATTGT ACCGAGAAAC 



1351 CCACAACTAT CTCTATTGGC TATATGCCAA TACTCTGTCC TTCAGAGACT 
GGTGTTGATA GAGATAACCG ATATACGGTT ATGAGACAGG AAGTCTCTGA 



1401 GACACGGACT CTGTATTTTT ACAGGATGGG GTCCATTTAT TATTTACAAA 
CTGTGCCTGA GACATAAAAA TGTCCTACCC CAGGTAAATA ATAAATGTTT 



1451 TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA
AAGTGTATAT GTTGTTGCGG CAGGGGGCAC GGGCGTCAAA AATAATTTGT 



MroI



BspEI 



BseAI 



AccIII 



1501 TAGCGTGGGA TCTCCGACAT CTCGGGTACG TGTTCCGGAC ATGGGCTCTT 
ATCGCACCCT AGAGGCTGTA GAGCCCATGC ACAAGGCCTG TACCCGAGAA 



1551 CTCCGGTAGC GGCGGAGCTT CCACATCCGA GCCCTGGTCC CATCCGTCCA 
GAGGCCATCG CCGCCTCGAA GGTGTAGGCT CGGGACCAGG GTAGGCAGGT 



1601 GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT GGAGGCCAGA 
CGCCGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA CCTCCGGTCT 



1651 CTTAGGCACA GCACAATGCC CACCACCACC AGTGTGCCGC ACAAGGCCGT 
GAATCCGTGT CGTGTTACGG GTGGTGGTGG TCACACGGCG TGTTCCGGCA 



1701 GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGAGATTGG GCTCGCACCT 
CCGCCATCCC ATACACAGAC TTTTACTCGA GCCTCTAACC CGAGCGTGGA 



BfrI



AflII PvuII



1751 GGACGCAGAT GGAAGACTTA AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT 
CCTGCGTCTA CCTTCTGAAT TCCGTCGCCG TCTTCTTCTA CGTCCGTCGA 



PvuII HpaI

1801 GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT TGCGGTGCTG 
CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA ACGCCACGAC 



HpaI

1851 TTAACGGTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG 
AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC 



1901 CGCCACCAGA CATAATAGCT GACAGACTAA CAGACTGTTC CTTTCCATGG 
GCGGTGGTCT GTATTATCGA CTGTCTGATT GTCTGACAAG GAAAGGTACC 



+2 M Q W N

SalI

1951 GTCTTTTCTG CAGTCACCGT CGTCGACCTA AGAATTCATG CAGTGGAACT 
CAGAAAAGAC GTCAGTGGCA GCAGCTGGAT TCTTAAGTAC GTCACCTTGA 



+2 STAF HQT LQDP RVR GLY
2001 CCACTGCCTT CCACCAAACT CTGCAGGATC CCAGAGTCAG GGGTCTGTAT 
GGTGACGGAA GGTGGTTTGA GACGTCCTAG GGTCTCAGTC CCCAGACATA 



+ 2LPAG GSS SGT VNPA PNI 
2051 CTTCCTGCTG GTGGCTCCAG TTCAGGAACA GTAAACCCTG CTCCGAATAT 
GAAGGACGAC CACCGAGGTC AAGTCCTTGT CATTTGGGAC GAGGCTTATA 



FIG. 2D 



+ 2 ASH ISSI SAR TGD PVT 
2101 TGCCTCTCAC ATCTCGTCAA TCTCCGCGAG GACTGGGGAC CCTGTGACGA 
ACGGAGAGTG TAGAGCAGTT AGAGGCGCTC CTGACCCCTG GGACACTGCT 



+ 2NMEN ITS GFLG PLL VLQ 
2151 ACATGGAGAA CATCACATCA GGATTCCTAG GACCCCTGCT CGTGTTACAG 
TGTACCTCTT GTAGTGTAGT CCTAAGGATC CTGGGGACGA GCACAATGTC 



+2 AGFF LLT RIL TIPQ SLD
2201 GCGGGGTTTT TCTTGTTGAC AAGAATCCTC ACAATACCGC AGAGTCTAGA 
CGCCCCAAAA AGAACAACTG TTCTTAGGAG TGTTATGGCG TCTCAGATCT 



+ 2 SWW TSLN FLG GSP VCL 
2251 CTCGTGGTGG ACTTCTCTCA ATTTTCTAGG GGGATCTCCC GTGTGTCTTG 
GAGCACCACC TGAAGAGAGT TAAAAGATCC CCCTAGAGGG CACACAGAAC 



+2 GQNS QSP TSNH SPT SCP
2301 GCCAAAATTC GCAGTCCCCA ACCTCCAATC ACTCACCAAC CTCCTGTCCT 
CGGTTTTAAG CGTCAGGGGT TGGAGGTTAG TGAGTGGTTG GAGGACAGGA 



+2PICP GYR WMC LRRF IIF 
2351 CCAATTTGTC CTGGTTATCG CTGGATGTGT CTGCGGCGTT TTATCATATT 
GGTTAAACAG GACCAATAGC GACCTACACA GACGCCGCAA AATAGTATAA 



+2 LFI LLLC LIF LLV LLD
2401 CCTCTTCATC CTGCTGCTAT GCCTCATCTT CTTATTGGTT CTTCTGGATT
GGAGAAGTAG GACGACGATA CGGAGTAGAA GAATAACCAA GAAGACCTAA


+2 YQGM LPV CPLI PGS TTT
2451 ATCAAGGTAT GTTGCCCGTT TGTCCTCTAA TTCCAGGATC AACAACAACC
TAGTTCCATA CAACGGGCAA ACAGGAGATT AAGGTCCTAG TTGTTGTTGG


+2 STGP CKT CTT PAQG NSM
BstAP I

BspMI EcoNI

BsgI

2501 AGTACGGGAC CATGCAAAAC CTGCACGACT CCTGCTCAAG GCAACTCTAT
TCATGCCCTG GTACGTTTTG GACGTGCTGA GGACGAGTTC CGTTGAGATA


+2 FPS CCCT KPT DGN CTC



2551 GTTTCCCTCA TGTTGCTGTA CAAAACCTAC GGATGGAAAT TGCACCTGTA 
CAAAGGGAGT ACAACGACAT GTTTTGGATG CCTACCTTTA ACGTGGACAT 



+2 IPIP SSW AFA KYLW EWA
BstXI 



2601 TTCCCATCCC ATCGTCCTGG GCTTTCGCAA AATACCTATG GGAGTGGGCC 
AAGGGTAGGG TAGCAGGACC CGAAAGCGTT TTATGGATAC CCTCACCCGG 



+2SVRF SWL SLL VPFV QWF 
2651 TCAGTCCGTT TCTCTTGGCT CAGTTTACTA GTGCCATTTG TTCAGTGGTT 
AGTCAGGCAA AGAGAACCGA GTCAAATGAT CACGGTAAAC AAGTCACCAA 



+2 VGL SPTV WLS AIW MMW
2701 CGTAGGGCTT TCCCCCACTG TTTGGCTTTC AGCTATATGG ATGATGTGGT
GCATCCCGAA AGGGGGTGAC AAACCGAAAG TCGATATACC TACTACACCA 



FIG. 2E 



+ 2YWGP SLY S IVS PFI PLL 
2751 ATTGGGGGCC AAGTCTGTAC AGCATCGTGA GTCCCTTTAT ACCGCTGTTA 
TAACCCCCGG TTCAGACATG TCGTAGCACT CAGGGAAATA TGGCGACAAT 



+ 2PIFF CLW VYI * 

BstZ17I XhoI



Bst1107I PaeR7I



2801 CCAATTTTCT TTTGTCTCTG GGTATACATT TAAGAATTCA GACTCGAGCA 
GGTTAAAAGA AAACAGAGAC CCATATGTAA ATTCTTAAGT CTGAGCTCGT 



AscI EcoRV MluI



2851 AGTCTAGAAA GGCGCGCCAA GATATCAAGG ATCCACTACG CGTTAGAGCT 
TCAGATCTTT CCGCGCGGTT CTATAGTTCC TAGGTGATGC GCAATCTCGA 



BclI



2901 CGCTGATCAG CCTCGACTGT GCCTTCTAGT TGCCAGCCAT CTGTTGTTTG 
GCGACTAGTC GGAGCTGACA CGGAAGATCA ACGGTCGGTA GACAACAAAC 



2951 CCCCTCCCCC GTGCCTTCCT TGACCCTGGA AGGTGCCACT CCCACTGTCC 
GGGGAGGGGG CACGGAAGGA ACTGGGACCT TCCACGGTGA GGGTGACAGG 



3001 TTTCCTAATA AAATGAGGAA ATTGCATCGC ATTGTCTGAG TAGGTGTCAT 
AAAGGATTAT TTTACTCCTT TAACGTAGCG TAACAGACTC ATCCACAGTA 



3051 TCTATTCTGG GGGGTGGGGT GGGGCAGGAC AGCAAGGGGG AGGATTGGGA 
AGATAAGACC CCCCACCCCA CCCCGTCCTG TCGTTCCCCC TCCTAACCCT



3101 AGACAATAGC AGGCATGCTG GGGAGCTCTT CCGCTTCCTC GCTCACTGAC 
TCTGTTATCG TCCGTACGAC CCCTCGAGAA GGCGAAGGAG CGAGTGACTG 



3151 TCGCTGCGCT CGGTCGTTCG GCTGCGGCGA GCGGTATCAG CTCACTCAAA 
AGCGACGCGA GCCAGCAAGC CGACGCCGCT CGCCATAGTC GAGTGAGTTT 



Pci I 

3201 GGCGGTAATA CGGTTATCCA CAGAATCAGG GGATAACGCA GGAAAGAACA 
CCGCCATTAT GCCAATAGGT GTCTTAGTCC CCTATTGCGT CCTTTCTTGT 



Pci I 

3251 TGTGAGCAAA AGGCCAGCAA AAGGCCAGGA ACCGTAAAAA GGCCGCGTTG 
ACACTCGTTT TCCGGTCGTT TTCCGGTCCT TGGCATTTTT CCGGCGCAAC 



3301 CTGGCGTTTT TCCATAGGCT CCGCCCCCCT GACGAGCATC ACAAAAATCG 
GACCGCAAAA AGGTATCCGA GGCGGGGGGA CTGCTCGTAG TGTTTTTAGC 



3351 ACGCTCAAGT CAGAGGTGGC GAAACCCGAC AGGACTATAA AGATACCAGG 
TGCGAGTTCA GTCTCCACCG CTTTGGGCTG TCCTGATATT TCTATGGTCC 



3401 CGTTTCCCCC TGGAAGCTCC CTCGTGCGCT CTCCTGTTCC GACCCTGCCG 
GCAAAGGGGG ACCTTCGAGG GAGCACGCGA GAGGACAAGG CTGGGACGGC 



FIG. 2F 



HaeII



3451 CTTACCGGAT ACCTGTCCGC CTTTCTCCCT TCGGGAAGCG TGGCGCTTTC
GAATGGCCTA TGGACAGGCG GAAAGAGGGA AGCCCTTCGC ACCGCGAAAG 



3501 TCAATGCTCA CGCTGTAGGT ATCTCAGTTC GGTGTAGGTC GTTCGCTCCA
AGTTACGAGT GCGACATCCA TAGAGTCAAG CCACATCCAG CAAGCGAGGT






3551 AGCTGGGCTG TGTGCACGAA CCCCCCGTTC AGCCCGACCG CTGCGCCTTA
TCGACCCGAC ACACGTGCTT GGGGGGCAAG TCGGGCTGGC GACGCGGAAT






3601 TCCGGTAACT ATCGTCTTGA GTCCAACCCG GTAAGACACG ACTTATCGCC
AGGCCATTGA TAGCAGAACT CAGGTTGGGC CATTCTGTGC TGAATAGCGG






3651 ACTGGCAGCA GCCACTGGTA ACAGGATTAG CAGAGCGAGG TATGTAGGCG
TGACCGTCGT CGGTGACCAT TGTCCTAATC GTCTCGCTCC ATACATCCGC






3701 GTGCTACAGA GTTCTTGAAG TGGTGGCCTA ACTACGGCTA CACTAGAAGG
CACGATGTCT CAAGAACTTC ACCACCGGAT TGATGCCGAT GTGATCTTCC






3751 ACAGTATTTG GTATCTGCGC TCTGCTGAAG CCAGTTACCT TCGGAAAAAG
TGTCATAAAC CATAGACGCG AGACGACTTC GGTCAATGGA AGCCTTTTTC






3801 AGTTGGTAGC TCTTGATCCG GCAAACAAAC CACCGCTGGT AGCGGTGGTT
TCAACCATCG AGAACTAGGC CGTTTGTTTG GTGGCGACCA TCGCCACCAA






3851 TTTTTGTTTG CAAGCAGCAG ATTACGCGCA GAAAAAAAGG ATCTCAAGAA
AAAAACAAAC GTTCGTCGTC TAATGCGCGT CTTTTTTTCC TAGAGTTCTT






3901 GATCCTTTGA TCTTTTCTAC GGGGTCTGAC GCTCAGTGGA ACGAAAACTC
CTAGGAAACT AGAAAAGATG CCCCAGACTG CGAGTCACCT TGCTTTTGAG






3951 ACGTTAAGGG ATTTTGGTCA TGAGATTATC AAAAAGGATC TTCACCTAGA
TGCAATTCCC TAAAACCAGT ACTCTAATAG TTTTTCCTAG AAGTGGATCT






4001 TCCTTTTAAA TTAAAAATGA AGTTTTAAAT CAATCTAAAG TATATATGAG
AGGAAAATTT AATTTTTACT TCAAAATTTA GTTAGATTTC ATATATACTC






4051 TAAACTTGGT CTGACAGTTA CCAATGCTTA ATCAGTGAGG CACCTATCTC
ATTTGAACCA GACTGTCAAT GGTTACGAAT TAGTCACTCC GTGGATAGAG










Eam1105I










AspEI 






4101 AGCGATCTGT CTATTTCGTT CATCCATAGT TGCCTGACTC CCCGTCGTGT
TCGCTAGACA GATAAAGCAA GTAGGTATCA ACGGACTGAG GGGCAGCACA






4151 AGATAACTAC GATACGGGAG GGCTTACCAT CTGGCCCCAG TGCTGCAATG
TCTATTGATG CTATGCCCTC CCGAATGGTA GACCGGGGTC ACGACGTTAC
Cfr10I



BsrFI 



4201 ATACCGCGAG ACCCACGCTC ACCGGCTCCA GATTTATCAG CAATAAACCA 
TATGGCGCTC TGGGTGCGAG TGGCCGAGGT CTAAATAGTC GTTATTTGGT 
BsaI



4251 GCCAGCCGGA AGGGCCGAGC GCAGAAGTGG TCCTGCAACT TTATCCGCCT
CGGTCGGCCT TCCCGGCTCG CGTCTTCACC AGGACGTTGA AATAGGCGGA 



4301 CCATCCAGTC TATTAATTGT TGCCGGGAAG CTAGAGTAAG TAGTTCGCCA
GGTAGGTCAG ATAATTAACA ACGGCCCTTC GATCTCATTC ATCAAGCGGT 



FspI
AviII



AosI



4351 GTTAATAGTT TGCGCAACGT TGTTGCCATT GCTACAGGCA TCGTGGTGTC 
CAATTATCAA ACGCGTTGCA ACAACGGTAA CGATGTCCGT AGCACCACAG 



4401 ACGCTCGTCG TTTGGTATGG CTTCATTCAG CTCCGGTTCC CAACGATCAA 
TGCGAGCAGC AAACCATACC GAAGTAAGTC GAGGCCAAGG GTTGCTAGTT 



4451 GGCGAGTTAC ATGATCCCCC ATGTTGTGCA AAAAAGCGGT TAGCTCCTTC 
CCGCTCAATG TACTAGGGGG TACAACACGT TTTTTCGCCA ATCGAGGAAG 



PvuI



4501 GGTCCTCCGA TCGTTGTCAG AAGTAAGTTG GCCGCAGTGT TATCACTCAT 
CCAGGAGGCT AGCAACAGTC TTCATTCAAC CGGCGTCACA ATAGTGAGTA 



4551 GGTTATGGCA GCACTGCATA ATTCTCTTAC TGTCATGCCA TCCGTAAGAT 
CCAATACCGT CGTGACGTAT TAAGAGAATG ACAGTACGGT AGGCATTCTA 



4601 GCTTTTCTGT GACTGGTGAG TACTCAACCA AGTCATTCTG AGAATAGTGT 
CGAAAAGACA CTGACCACTC ATGAGTTGGT TCAGTAAGAC TCTTATCACA 



BcgI



4651 ATGCGGCGAC CGAGTTGCTC TTGCCCGGCG TCAATACGGG ATAATACCGC 
TACGCCGCTG GCTCAACGAG AACGGGCCGC AGTTATGCCC TATTATGGCG 



XmnI 



Asp700 



4701 GCCACATAGC AGAACTTTAA AAGTGCTCAT CATTGGAAAA CGTTCTTCGG 
CGGTGTATCG TCTTGAAATT TTCACGAGTA GTAACCTTTT GCAAGAAGCC 



4751 GGCGAAAACT CTCAAGGATC TTACCGCTGT TGAGATCCAG TTCGATGTAA 
CCGCTTTTGA GAGTTCCTAG AATGGCGACA ACTCTAGGTC AAGCTACATT 



4801 CCCACTCGTG CACCCAACTG ATCTTCAGCA TCTTTTACTT TCACCAGCGT
GGGTGAGCAC GTGGGTTGAC TAGAAGTCGT AGAAAATGAA AGTGGTCGCA 



FIG. 2H 



4851 TTCTGGGTGA GCAAAAACAG GAAGGCAAAA TGCCGCAAAA AAGGGAATAA 
AAGACCCACT CGTTTTTGTC CTTCCGTTTT ACGGCGTTTT TTCCCTTATT 



4901 GGGCGACACG GAAATGTTGA ATACTCATAC TCTTCCTTTT TCAATATTAT
CCCGCTGTGC CTTTACAACT TATGAGTATG AGAAGGAAAA AGTTATAATA 



4951 TGAAGCATTT ATCAGGGTTA TTGTCTCATG AGCGGATACA TATTTGAATG 
ACTTCGTAAA TAGTCCCAAT AACAGAGTAC TCGCCTATGT ATAAACTTAC 



5001 TATTTAGAAA AATAAACAAA TAGGGGTTCC GCGCACATTT CCCCGAAAAG 
ATAAATCTTT TTATTTGTTT ATCCCCAAGG CGCGTGTAAA GGGGCTTTTC 



5051 TGCCACCTGA CGTCTAAGAA ACCATTATTA TCATGACATT AACCTATAAA 
ACGGTGGACT GCAGATTCTT TGGTAATAAT AGTACTGTAA TTGGATATTT 



5101 AATAGGCGTA TCACGAGGCC CTTTCGTC 
TTATCCGCAT AGTGCTCCGG GAAAGCAG 



FIG. 2I
1 TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG
AGCGCGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC



51 GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG 
CTCTGCCAGT GTCGAACAGA CATTCGCCTA CGGCCCTCGT CTGTTCGGGC 

101 TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG 
AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC






HindIII

151 CGGCATCAGA GCAGATTGTA CTGAGAGTGC ACCATATGAA GCTTTTTGCA 
GCCGTAGTCT CGTCTAACAT GACTCTCACG TGGTATACTT CGAAAAACGT 



StuI 
AatI 



201 AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA CTACTTCTGG AATAGCTCAG 
TTTCGGATCC GGAGGTTTTT TCGGAGGAGT GATGAAGACC TTATCGAGTC 



SfiI



251 AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA 
TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT 

301 TGGGGCGGAG AATGGGCGGA ACTGGGCGGG GAGGGAATTA TTGGCTATTG 
ACCCCGCCTC TTACCCGCCT TGACCCGCCC CTCCCTTAAT AACCGATAAC



351 GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT TATATTGGCT 
CGGTAACGTA TGCAACATAG ATATAGTATT ATACATGTAA ATATAACCGA 



401 CATGTCCAAT ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 
GTACAGGTTA TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT 



451 TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG
ATCATTAGTT AATGCCCCAG TAATCAAGTA TCGGGTATAT ACCTCAAGGC 



501 CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 
GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG 

551 CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 
GGGCGGGTAA CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT 

601 GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA 
CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA TTTGACGGGT

651 CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC CCTATTGACG
GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC

701 TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 
AGTTACTGCC ATTTACCGGG CGGACCGTAA TACGGGTCAT GTACTGGAAT 

SnaBI 



751 CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 
GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG 

801 CATGGTGATG CGGTTTTGGC AGTACACCAA TGGGCGTGGA TAGCGGTTTG
GTACCACTAC GCCAAAACCG TCATGTGGTT ACCCGCACCT ATCGCCAAAC 



851 ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 
TGAGTGCCCC TAAAGGTTCA GAGGTGGGGT AACTGCAGTT ACCCTCAAAC 



901 TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ATAACCCCGC 
AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGGGCG 



951 CCCGTTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 
GGGCAACTGC GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT 



1001 GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCACGC 
CGTCTCGAGC AAATCACTTG GCAGTCTAGC GGACCTCTGC GGTAGGTGCG 



XmaIII



SacII 



KspI



EclXI 



EagI 



1051 TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 
ACAAAACTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC 



1101 GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA 
CCTTGCCACG TAACCTTGCG CCTAAGGGGC ACGGTTCTCA CTGCATTCAT 



Ppu10I



NsiI



1151 CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 
GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGAGAATA CGTACGATAT 



1201 CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA 
GACAAAAACC GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT 



EspI



CelII



Bpu1102I



1251 TGGTATAGCT TAGCCTATAG GTGTGGGTTA TTGACCATTA TTGACCACTC 
ACCATATCGA ATCGGATATC CACACCCAAT AACTGGTAAT AACTGGTGAG 



1301 CCCTATTGGT GACGATACTT TCCATTACTA ATCCATAACA TGGCTCTTTG 
GGGATAACCA CTGCTATGAA AGGTAATGAT TAGGTATTGT ACCGAGAAAC 



1351 CCACAACTAT CTCTATTGGC TATATGCCAA TACTCTGTCC TTCAGAGACT 
GGTGTTGATA GAGATAACCG ATATACGGTT ATGAGACAGG AAGTCTCTGA 



1401 GACACGGACT CTGTATTTTT ACAGGATGGG GTCCATTTAT TATTTACAAA
CTGTGCCTGA GACATAAAAA TGTCCTACCC CAGGTAAATA ATAAATGTTT 



FIG. 3C 



1451 TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA
AAGTGTATAT GTTGTTGCGG CAGGGGGCAC GGGCGTCAAA AATAATTTGT 



MroI



BspEI 



BseAI 



AccIII 



1501 TAGCGTGGGA TCTCCGACAT CTCGGGTACG TGTTCCGGAC ATGGGCTCTT 
ATCGCACCCT AGAGGCTGTA GAGCCCATGC ACAAGGCCTG TACCCGAGAA 



1551 CTCCGGTAGC GGCGGAGCTT CCACATCCGA GCCCTGGTCC CATCCGTCCA 
GAGGCCATCG CCGCCTCGAA GGTGTAGGCT CGGGACCAGG GTAGGCAGGT 



1601 GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT GGAGGCCAGA 
CGCCGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA CCTCCGGTCT 



1651 CTTAGGCACA GCACAATGCC CACCACCACC AGTGTGCCGC ACAAGGCCGT 
GAATCCGTGT CGTGTTACGG GTGGTGGTGG TCACACGGCG TGTTCCGGCA 



1701 GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGAGATTGG GCTCGCACCT 
CCGCCATCCC ATACACAGAC TTTTACTCGA GCCTCTAACC CGAGCGTGGA 



BfrI



AflII



1751 GGACGCAGAT GGAAGACTTA AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT 
CCTGCGTCTA CCTTCTGAAT TCCGTCGCCG TCTTCTTCTA CGTCCGTCGA 



HpaI

1801 GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT TGCGGTGCTG 
CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA ACGCCACGAC 



HpaI



1851 TTAACGGTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG
AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC 



1901 CGCCACCAGA CATAATAGCT GACAGACTAA CAGACTGTTC CTTTCCATGG 
GCGGTGGTCT GTATTATCGA CTGTCTGATT GTCTGACAAG GAAAGGTACC 

SalI



1951 GTCTTTTCTG CAGTCACCGT CGTCGACGAA TTCAAGCAAT CATGGATGCA
CAGAAAAGAC GTCAGTGGCA GCAGCTGCTT AAGTTCGTTA GTACCTACGT 



+3 MKRG LCC VLL LCGA VFV
2001 ATGAAGAGAG GGCTCTGCTG TGTGCTGCTG CTGTGTGGAG CAGTCTTCGT 
TACTTCTCTC CCGAGACGAC ACACGACGAC GACACACCTC GTCAGAAGCA 
+3 S P S A S Y Q V R N S T G L Y H

NheI PmlI



Eco47III PmaCI



Afe I SexAI BbrPI 



2051 TTCGCCCAGC GCTAGCTACC AGGTGCGCAA CAGCACCGGC CTGTACCACG 
AAGCGGGTCG CGATCGATGG TCCACGCGTT GTCGTGGCCG GACATGGTGC



+ 3VTND CPN SSIV YEA ADA 
PmlI

PmaCI 

BbrPI 

2101 TGACCAACGA CTGCCCCAAC AGCAGCATCG TGTACGAGGC CGCCGACGCC 
ACTGGTTGCT GACGGGGTTG TCGTCGTAGC ACATGCTCCG GCGGCTGCGG 



+3 I L H T PGC VPC V R E G N A S
2151 ATCCTGCACA CCCCCGGCTG CGTGCCCTGC GTGCGCGAGG GCAACGCCAG 
TAGGACGTGT GGGGGCCGAC GCACGGGACG CACGCGCTCC CGTTGCGGTC 



+ 3 RCW VAMT PTV ATR DGK 
2201 CCGCTGCTGG GTGGCCATGA CCCCCACCGT GGCCACCCGC GACGGCAAGC 
GGCGACGACC CACCGGTACT GGGGGTGGCA CCGGTGGGCG CTGCCGTTCG 



+3LPAT QLR RHID L L V GSA 

DraIII

2251 TGCCCGCCAC CCAGCTGCGC CGCCACATCG ACCTGCTGGT GGGCAGCGCC 
ACGGGCGGTG GGTCGACGCG GCGGTGTAGC TGGACGACCA CCCGTCGCGG 



+3TLCS ALY VGD LCGS VFL 
DraIII



2301 ACCCTGTGCA GCGCCCTGTA CGTGGGCGAC CTGTGCGGCA GCGTGTTCCT 
TGGGACACGT CGCGGGACAT GCACCCGCTG GACACGCCGT CGCACAAGGA 



+3 VGQ LFTF SPR RHW TTQ 
2351 GGTGGGCCAG CTGTTCACCT TCAGCCCCCG CCGCCACTGG ACCACCCAGG 
CCACCCGGTC GACAAGTGGA AGTCGGGGGC GGCGGTGACC TGGTGGGTCC 



+3 GCNC SIY PGHI TGH RMA
2401 GCTGCAACTG CAGCATCTAC CCCGGCCACA TCACCGGCCA CCGCATGGCC 
CGACGTTGAC GTCGTAGATG GGGCCGGTGT AGTGGCCGGT GGCGTACCGG 



+3 WDMM MNW SPT TMEN ITS
2451 TGGGACATGA TGATGAACTG GAGCCCCACC ACCATGGAGA ACATCACATC
ACCCTGTACT ACTACTTGAC CTCGGGGTGG TGGTACCTCT TGTAGTGTAG



+3 GF L G P L L V L Q AGF FLL 
PpuMI 

2501 AGGATTCCTA GGACCCCTGC TCGTGTTACA GGCGGGGTTT TTCTTGTTGA
TCCTAAGGAT CCTGGGGACG AGCACAATGT CCGCCCCAAA AAGAACAACT



FIG. 3E



+3 TRIL TIPQ SLD SWW TSL
2551 CAAGAATCCT CACAATACCG CAGAGTCTAG ACTCGTGGTG GACTTCTCTC
GTTCTTAGGA GTGTTATGGC GTCTCAGATC TGAGCACCAC CTGAAGAGAG 



+3 NFLG GSP VCL GQNS QSP
2601 AATTTTCTAG GGGGATCTCC CGTGTGTCTT GGCCAAAATT CGCAGTCCCC 
TTAAAAGATC CCCCTAGAGG GCACACAGAA CCGGTTTTAA GCGTCAGGGG 

+3 TSN HSPT SCP PIC PGY
2651 AACCTCCAAT CACTCACCAA CCTCCTGTCC TCCAATTTGT CCTGGTTATC
TTGGAGGTTA GTGAGTGGTT GGAGGACAGG AGGTTAAACA GGACCAATAG 



+3 R W M C L R R FIIF LFI LLL 
2701 GCTGGATGTG TCTGCGGCGT TTTATCATAT TCCTCTTCAT CCTGCTGCTA 
CGACCTACAC AGACGCCGCA AAATAGTATA AGGAGAAGTA GGACGACGAT 



+ 3CLIF L L V L L D Y Q G M LPV 
2751 TGCCTCATCT TCTTATTGGT TCTTCTGGAT TATCAAGGTA TGTTGCCCGT
ACGGAGTAGA AGAATAACCA AGAAGACCTA ATAGTTCCAT ACAACGGGCA 



+3 GPL IPGS TTT STG PCK 

BstAP I 



2801 TTGTCCTCTA ATTCCAGGAT CAACAACAAC CAGTACGGGA CCATGCAAAA
AACAGGAGAT TAAGGTCCTA GTTGTTGTTG GTCATGCCCT GGTACGTTTT 



+3TCTT PAQ GNSM FPS CCC 
BstAP I EcoNI 



2851 CCTGCACGAC TCCTGCTCAA GGCAACTCTA TGTTTCCCTC ATGTTGCTGT 
GGACGTGCTG AGGACGAGTT CCGTTGAGAT ACAAAGGGAG TACAACGACA 



+3TKPT DGN CTC IPIP SSW 
2901 ACAAAACCTA CGGATGGAAA TTGCACCTGT ATTCCCATCC CATCGTCCTG 
TGTTTTGGAT GCCTACCTTT AACGTGGACA TAAGGGTAGG GTAGCAGGAC 



+3 A F A KYLW E W A SVR FSW
2951 GGCTTTCGCA AAATACCTAT GGGAGTGGGC CTCAGTCCGT TTCTCTTGGC
CCGAAAGCGT TTTATGGATA CCCTCACCCG GAGTCAGGCA AAGAGAACCG 



+3LSLL VPF VQWF V G L SPT 
3001 TCAGTTTACT AGTGCCATTT GTTCAGTGGT TCGTAGGGCT TTCCCCCACT
AGTCAAATGA TCACGGTAAA CAAGTCACCA AGCATCCCGA AAGGGGGTGA 



+3VWLS A I W M M W Y W G P SLY 
3051 GTTTGGCTTT CAGCTATATG GATGATGTGG TATTGGGGGC CAAGTCTGTA 
CAAACCGAAA GTCGATATAC CTACTACACC ATAACCCCCG GTTCAGACAT 



+3 SIV SPFI PLL PIF FCL 
3101 CAGCATCGTG AGTCCCTTTA TACCGCTGTT ACCAATTTTC TTTTGTCTCT 
GTCGTAGCAC TCAGGGAAAT ATGGCGACAA TGGTTAAAAG AAAACAGAGA 



+ 3 W V Y I * 

BstZ17 I XhoI



Bst1107I PaeR7I AscI



3151 GGGTATACAT TTAAGAATTC AGACTCGAGC AAGTCTAGAA AGGCGCGCCA
CCCATATGTA AATTCTTAAG TCTGAGCTCG TTCAGATCTT TCCGCGCGGT

FIG. 3F



EcoRV BamHI MluI BclI

3201 AGATATCAAG GATCCACTAC GCGTTAGAGC TCGCTGATCA GCCTCGACTG
TCTATAGTTC CTAGGTGATG CGCAATCTCG AGCGACTAGT CGGAGCTGAC






3251 TGCCTTCTAG TTGCCAGCCA TCTGTTGTTT GCCCCTCCCC CGTGCCTTCC
ACGGAAGATC AACGGTCGGT AGACAACAAA CGGGGAGGGG GCACGGAAGG






3301 TTGACCCTGG AAGGTGCCAC TCCCACTGTC CTTTCCTAAT AAAATGAGGA
AACTGGGACC TTCCACGGTG AGGGTGACAG GAAAGGATTA TTTTACTCCT






3351 AATTGCATCG CATTGTCTGA GTAGGTGTCA TTCTATTCTG GGGGGTGGGG
TTAACGTAGC GTAACAGACT CATCCACAGT AAGATAAGAC CCCCCACCCC







3401 TGGGGCAGGA CAGCAAGGGG GAGGATTGGG AAGACAATAG CAGGCATGCT
ACCCCGTCCT GTCGTTCCCC CTCCTAACCC TTCTGTTATC GTCCGTACGA






3451 GGGGAGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC TCGGTCGTTC
CCCCTCGAGA AGGCGAAGGA GCGAGTGACT GAGCGACGCG AGCCAGCAAG






3501 GGCTGCGGCG AGCGGTATCA GCTCACTCAA AGGCGGTAAT ACGGTTATCC
CCGACGCCGC TCGCCATAGT CGAGTGAGTT TCCGCCATTA TGCCAATAGG






Pci I

3551 ACAGAATCAG GGGATAACGC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA
TGTCTTAGTC CCCTATTGCG TCCTTTCTTG TACACTCGTT TTCCGGTCGT






3601 AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT TTCCATAGGC
TTTCCGGTCC TTGGCATTTT TCCGGCGCAA CGACCGCAAA AAGGTATCCG






3651 TCCGCCCCCC TGACGAGCAT CACAAAAATC GACGCTCAAG TCAGAGGTGG
AGGCGGGGGG ACTGCTCGTA GTGTTTTTAG CTGCGAGTTC AGTCTCCACC






3701 CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC CTGGAAGCTC
GCTTTGGGCT GTCCTGATAT TTCTATGGTC CGCAAAGGGG GACCTTCGAG






3751 CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTGTCCG
GGAGCACGCG AGAGGACAAG GCTGGGACGG CGAATGGCCT ATGGACAGGC






3801 CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCAATGCTC ACGCTGTAGG
GGAAAGAGGG AAGCCCTTCG CACCGCGAAA GAGTTACGAG TGCGACATCC






3851 TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA
ATAGAGTCAA GCCACATCCA GCAAGCGAGG TTCGACCCGA CACACGTGCT






3901 ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGGTAAC TATCGTCTTG
TGGGGGGCAA GTCGGGCTGG CGACGCGGAA TAGGCCATTG ATAGCAGAAC






3951 AGTCCAACCC GGTAAGACAC GACTTATCGC CACTGGCAGC AGCCACTGGT
TCAGGTTGGG CCATTCTGTG CTGAATAGCG GTGACCGTCG TCGGTGACCA







4001 AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG AGTTCTTGAA 
TTGTCCTAAT CGTCTCGCTC CATACATCCG CCACGATGTC TCAAGAACTT 



4051 GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT GGTATCTGCG
CACCACCGGA TTGATGCCGA TGTGATCTTC CTGTCATAAA CCATAGACGC 



FIG. 3G 




4101 CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG CTCTTGATCC
GAGACGACTT CGGTCAATGG AAGCCTTTTT CTCAACCATC GAGAACTAGG



4151 GGCAAACAAA CCACCGCTGG TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA
CCGTTTGTTT GGTGGCGACC ATCGCCACCA AAAAAACAAA CGTTCGTCGT



4201 GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG ATCTTTTCTA
CTAATGCGCG TCTTTTTTTC CTAGAGTTCT TCTAGGAAAC TAGAAAAGAT



4251 CGGGGTCTGA CGCTCAGTGG AACGAAAACT CACGTTAAGG GATTTTGGTC
GCCCCAGACT GCGAGTCACC TTGCTTTTGA GTGCAATTCC CTAAAACCAG



4301 ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA ATTAAAAATG
TACTCTAATA GTTTTTCCTA GAAGTGGATC TAGGAAAATT TAATTTTTAC



4351 AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT
TTCAAAATTT AGTTAGATTT CATATATACT CATTTGAACC AGACTGTCAA



4401 ACCAATGCTT
TGGTTACGAA



4451 TCATCCATAG TTGCCTGACT CCCCGTCGTG TAGATAACTA CGATACGGGA
AGTAGGTATC AACGGACTGA GGGGCAGCAC ATCTATTGAT GCTATGCCCT



4501 GGGCTTACCA TCTGGCCCCA GTGCTGCAAT GATACCGCGA GACCCACGCT
CCCGAATGGT AGACCGGGGT CACGACGTTA CTATGGCGCT CTGGGTGCGA
BsaI




4901 AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG TGACTGGTGA
TTAAGAGAAT GACAGTACGG TAGGCATTCT ACGAAAAGAC ACTGACCACT



FIG. 3H 



BcgI

4951 GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA CCGAGTTGCT
CATGAGTTGG TTCAGTAAGA CTCTTATCAC ATACGCCGCT GGCTCAACGA



5001 CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG CAGAACTTTA
GAACGGGCCG CAGTTATGCC CTATTATGGC GCGGTGTATC GTCTTGAAAT



XmnI 
Asp700



5051 AAAGTGCTCA TCATTGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT
TTTCACGAGT AGTAACCTTT TGCAAGAAGC CCCGCTTTTG AGAGTTCCTA



5101 CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT GCACCCAACT
GAATGGCGAC AACTCTAGGT CAAGCTACAT TGGGTGAGCA CGTGGGTTGA



5151 GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG AGCAAAAACA
CTAGAAGTCG TAGAAAATGA AAGTGGTCGC AAAGACCCAC TCGTTTTTGT



5201 GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC GGAAATGTTG
CCTTCCGTTT TACGGCGTTT TTTCCCTTAT TCCCGCTGTG CCTTTACAAC

SspI



5251 AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT TATCAGGGTT
TTATGAGTAT GAGAAGGAAA AAGTTATAAT AACTTCGTAA ATAGTCCCAA

5301 ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA
TAACAGAGTA CTCGCCTATG TATAAACTTA CATAAATCTT TTTATTTGTT



5351 ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCACCTG ACGTCTAAGA
TATCCCCAAG GCGCGTGTAA AGGGGCTTTT CACGGTGGAC TGCAGATTCT



5401 AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT ATCACGAGGC
TTGGTAATAA TAGTACTGTA ATTGGATATT TTTATCCGCA TAGTGCTCCG

5451 CCTTTCGTC
GGAAAGCAG




FIG. 4A 






jgs assss s?s saaas asss asssa assss sass; 



or 



81 GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA
CGGCCCTCGT CTGTTCGGGC AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC GCCGTAGTCT

■ ,,,^^»rr rrTCCAAAAA AGCCTCCTCA CTACTTCTGG 

»S ggggS SS CGftfiAftACGT SEgSg «~m" TCCC^OT _CC 

241 AATAGCTCAG AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA TGGGGCGGAG AATGGGCGGA
TTATCGAGTC TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT ACCCCGCCTC TTACCCGCCT



321



EKSS SSSSi SSSS SSSgS S5SSS S>" SSSaS 5BBS5 



401 



481 
561 
641 
721 



■ ' " ' " mm , n ,„., TivrTAATCAA TTACGGGGTC ATTAGTTCAT 

SSI SSSS = S55SK !U TAMCftAGTA 



?sass sss asas sasss ass sassa gga 

■ rrrirriTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT 

GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA gGACTTTCC TACCCACCTC ATAAATGCCA 
CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTT AT uuww 

" " **r*rrrrcc CCTATTGACG TCAATGACGG TAAATGGCCC 

ass a saas asss sasss asss 

GCCTCGCATT ATGCCO,™ £™C CTACT^CA CT = C — 
CGGACCGTAA TACGGGTCAT GTACTGGAAT GCCCTGAAAG GATGAACCGT lAiu . 



801 



— ■ " ~~ " arTPArCGGG ATTTCCAAGT CTCCACCCCA 

SSSSS SS JSSSSJ 355% " ccc T " c » m ~ T 



881 
961 



1041 



— ■ — ■ ! " p . a R ATGTCGT A ATAACCCCGC CCCGTTGACG 

s aass saass ss sassa as ^™ «*™ 
assgs ggggs sasss sags sasss gaasssas 

™- SE 




1521 CTCGGGTACG TGTTCCGGAC 
GAGCCCATGC ACAAGGCCTG 

1601 GCGGCTCATG GTCGCTCGGC 
CGCCGAGTAC CAGCGAGCCG 
- — 



~~ rrrrrnrrTT rrACATCCGA GCCCTGGTCC CATCCGTCCA 

ga as ggg as ^ 

r^rnrrmr* PTTAGGCACA GCACAATGCC CACCACCACC 

SS SS35SS SSSS SBg™ c^™*- 



1681 AGTGTGCCGC ACAAGGCCGT
TCACACGGCG TGTTCCGGCA 



nniMv^rarrT PGGAGATTGG GCTCGCACCT GGACGCAGAT 
SS SISS SSKSS C^C^ CCTOCOTC™ 



1761 GGAAGACTTA AGGCAGCGGC 
CCTTCTGAAT TCCGTCGCCG 

1841 TGCGGTGCTG TTAACGGTGG 
ACGCCACGAC AATTGCCACC 



+ 3 



1921 



™™<Pr<r<rrT ATTCTGATAA GAGTCAGAGG TAACTCCCGT 

jSjga s asss sasss 5sss« > 

rc^rpprTTr PTGCCGCGCG CGCCACCAGA CATAATAGCT 

SS SS 

D A 



PstI 



EcoRI 



»«»*» CRGACTGTTC CTTTCCATS0 SS ES 
CTGTCTGATT GTCTGACAAG GAAAGGTACC CAGAAAAGAC GTCAGTGfalA tamw. ^ 



+ 3 GGS AGHT VSG F V S Jl r ^ CG LJ CAGGC^CM GCAGAACGTC CAGCTGATCA 
2M1 SS SS? !SS SS Sc«TT CCTCTT^ 0TC~ 



+ 3NTNG SWH LNST A L N ^ ^ D a r rr m rj ^(^ T cCGGCTGGTT GGCAGGGCTT 

21 " b= SSSSS SS«™»o™ 



+3FYHH KFN SSG C P E R L A S C^R ^ ctTACCGAt/tTGACCAGGG 
2241 TTCTATCACC ACAAGTTCAA CTCTTCAGGC TGTCCTGAGA GGCT^CCAG gggGA^ GAATGGCT AA AACTGGTCCC 
AAGATAGTGG TGTTCAAGTT GAGAAGTCCG ACAGGACTCT t-ct.aiuwiv- ^ 

+3 WGP ISYA NGS G P D 0 * ! * CTG ^ TG ^ciI CTACCCCCCA AAACCTTGCG 

2321 ss sasss ssss ss ss Sccc — . 



+3GIVP AKS VCGP VYC F T P JtfJL^m vgG?gJgPJ& GACCGACAGG 
2401 GTATTGTGCC CGCGAAGAGT GTGTGTGGTC CGGTATATTG CTJCACTCCC ^CCCCGTGG TGGTGGG 

CATAACACGG GCGCTTCTCA CACACACCAG GCCATATAAC GAAGTGAGGG TCGGGGCAll ft^H 



+3 WFG CTWM NST GFT K m V C G A P P TT q TGTCATC GGAGGGGCGG 
2561 TTGGTTCGGT TGTACCTGGA TGAACTCAAC TGGATTCACC ^AGTGTGCG GAGCGCCTCC ™«» C CTCCCCGCC 
AACCAAGCCA ACATGGACCT ACTTGAGTTG ACCTAAGTGG TTTCACACGC CTCGCGGAGfa ftftiftv 



26 u 3 Ic^c J^ScZc clc^%c^ oca™ gg^'sSSg S3SS 

CGTTGTTGTG GGACGTGACG GGGTGACTAA CGAAGGCGTT CGTAGGCCTG CGGTblftlbft o 




+ 3WITP RCL V D Y P Y R L WHY PCT INYT IFK

2721 TGGATCACAC CCAGGTGCCT GGTCGACTAC CCGTATAGGC TTTGGCATTA TCCTTGTACC ATCAACTACA CCATATTTAA 

ACCTAGTGTG GGTCCACGGA CCAGCTGATG GGCATATCCG AAACCGTAAT AGGAACATGG TAGTTGATGT GGTATAAATT 



+ 3 I R M YVGG VEH RLE A A C N WTR GER CDL
2801 AATCAGGATG TACGTGGGAG GGGTCGAACA CAGGCTGGAA GCTGCCTGCA ACTGGACGCG GGGCGAACGT TGCGATCTGG
TTAGTCCTAC ATGCACCCTC CCCAGCTTGT GTCCGACCTT CGACGGACGT TGACCTGCGC CCCGCTTGCA ACGCTAGACC 



+ 3EDRD RSE IDME NIT S G F LGPL LVL QAG 

Clal 



2881 AAGATAGGGA CAGGTCCGAG ATCGATATGG AGAACATCAC ATCAGGATTC CTAGGACCCC TGCTCGTGTT ACAGGCGGGG 
TTCTATCCCT GTCCAGGCTC TAGCTATACC TCTTGTAGTG TAGTCCTAAG GATCCTGGGG ACGAGCACAA TGTCCGCCCC 



+3FFLL TRI LTI PQS L DSW WTS LNFL GGS
2961 TTTTTCTTGT TGACAAGAAT CCTCACAATA CCGCAGAGTC TAGACTCGTG GTGGACTTCT CTCAATTTTC TAGGGGGATC 
AAAAAGAACA ACTGTTCTTA GGAGTGTTAT GGCGTCTCAG ATCTGAGCAC CACCTGAAGA GAGTTAAAAG ATCCCCCTAG 



+ 3 PVC LGQN SQS PTS NHSP TSC PPI CPG 
3041 TCCCGTGTGT CTTGGCCAAA ATTCGCAGTC CCCAACCTCC AATCACTCAC CAACCTCCTG TCCTCCAATT TGTCCTGGTT 
AGGGCACACA GAACCGGTTT TAAGCGTCAG GGGTTGGAGG TTAGTGAGTG GTTGGAGGAC AGGAGGTTAA ACAGGACCAA 



+ 3YRWM C L R RFII FLF I LL LCLI FLL VLL 
3121 ATCGCTGGAT GTGTCTGCGG CGTTTTATCA TATTCCTCTT CATCCTGCTG CTATGCCTCA TCTTCTTATT GGTTCTTCTG 
TAGCGACCTA CACAGACGCC GCAAAATAGT ATAAGGAGAA GTAGGACGAC GATACGGAGT AGAAGAATAA CCAAGAAGAC 



+ 3DYQG MLP VCP LIPG STT TST GPCK TCT 
3201 GATTATCAAG GTATGTTGCC CGTTTGTCCT CTAATTCCAG GATCAACAAC AACCAGTACG GGACCATGCA AAACCTGCAC 
CTAATAGTTC CATACAACGG GCAAACAGGA GATTAAGGTC CTAGTTGTTG TTGGTCATGC CCTGGTACGT TTTGGACGTG 



+ 3 TPA QGNS MFP SCC CTKP TDG NCT CIP 
3281 GACTCCTGCT CAAGGCAACT CTATGTTTCC CTCATGTTGC TGTACAAAAC CTACGGATGG AAATTGCACC TGTATTCCCA 
CTGAGGACGA GTTCCGTTGA GATACAAAGG GAGTACAACG ACATGTTTTG GATGCCTACC TTTAACGTGG ACATAAGGGT 



+3IPSS WAF AKYL WEW ASV RFSW LSL LVP 
3361 TCCCATCGTC CTGGGCTTTC GCAAAATACC TATGGGAGTG GGCCTCAGTC CGTTTCTCTT GGCTCAGTTT ACTAGTGCCA 
AGGGTAGCAG GACCCGAAAG CGTTTTATGG ATACCCTCAC CCGGAGTCAG GCAAAGAGAA CCGAGTCAAA TGATCACGGT 



+3FVQW FVG LSP TVWL SAI WMM WYWG PSL 
3441 TTTGTTCAGT GGTTCGTAGG GCTTTCCCCC ACTGTTTGGC TTTCAGCTAT ATGGATGATG TGGTATTGGG GGCCAAGTCT 
AAACAAGTCA CCAAGCATCC CGAAAGGGGG TGACAAACCG AAAGTCGATA TACCTACTAC ACCATAACCC CCGGTTCAGA 



+ 3 YSI VSPF IPL LPI FFCL WVY I* 

EcoRI 



3521 GTACAGCATC GTGAGTCCCT TTATACCGCT GTTACCAATT TTCTTTTGTC TCTGGGTATA CATTTAAGAA TTCAGACTCG 
CATGTCGTAG CACTCAGGGA AATATGGCGA CAATGGTTAA AAGAAAACAG AGACCCATAT GTAAATTCTT AAGTCTGAGC 



BamHI 



3601 AGCAAGTCTA GAAAGGCGCG CCAAGATATC AAGGATCCAC TACGCGTTAG AGCTCGCTGA TCAGCCTCGA CTGTGCCTTC 
TCGTTCAGAT CTTTCCGCGC GGTTCTATAG TTCCTAGGTG ATGCGCAATC TCGAGCGACT AGTCGGAGCT GACACGGAAG 



3681 TAGTTGCCAG CCATCTGTTG TTTGCCCCTC CCCCGTGCCT TCCTTGACCC TGGAAGGTGC CACTCCCACT GTCCTTTCCT 
ATCAACGGTC GGTAGACAAC AAACGGGGAG GGGGCACGGA AGGAACTGGG ACCTTCCACG GTGAGGGTGA CAGGAAAGGA 



3761 AATAAAATGA GGAAATTGCA TCGCATTGTC TGAGTAGGTG TCATTCTATT CTGGGGGGTG GGGTGGGGCA GGACAGCAAG 
TTATTTTACT CCTTTAACGT AGCGTAACAG ACTCATCCAC AGTAAGATAA GACCCCCCAC CCCACCCCGT CCTGTCGTTC 



FIG. 4D 




3841 GGGGAGGATT GGGAAGACAA TAGCAGGCAT GCTGGGGAGC TCTTCCGCTT CCTCGCTCAC TGACTCGCTG CGCTCGGTCG 
CCCCTCCTAA CCCTTCTGTT ATCGTCCGTA CGACCCCTCG AGAAGGCGAA GGAGCGAGTG ACTGAGCGAC GCGAGCCAGC 



3921 TTCGGCTGCG GCGAGCGGTA TCAGCTCACT CAAAGGCGGT AATACGGTTA TCCACAGAAT CAGGGGATAA CGCAGGAAAG
AAGCCGACGC CGCTCGCCAT AGTCGAGTGA GTTTCCGCCA TTATGCCAAT AGGTGTCTTA GTCCCCTATT GCGTCCTTTC


4001 AACATGTGAG CAAAAGGCCA GCAAAAGGCC AGGAACCGTA AAAAGGCCGC GTTGCTGGCG TTTTTCCATA GGCTCCGCCC
TTGTACACTC GTTTTCCGGT CGTTTTCCGG TCCTTGGCAT TTTTCCGGCG CAACGACCGC AAAAAGGTAT CCGAGGCGGG


4081 CCCTGACGAG CATCACAAAA ATCGACGCTC AAGTCAGAGG TGGCGAAACC CGACAGGACT ATAAAGATAC CAGGCGTTTC
GGGACTGCTC GTAGTGTTTT TAGCTGCGAG TTCAGTCTCC ACCGCTTTGG GCTGTCCTGA TATTTCTATG GTCCGCAAAG


4161 CCCCTGGAAG CTCCCTCGTG CGCTCTCCTG TTCCGACCCT GCCGCTTACC GGATACCTGT CCGCCTTTCT CCCTTCGGGA
GGGGACCTTC GAGGGAGCAC GCGAGAGGAC AAGGCTGGGA CGGCGAATGG CCTATGGACA GGCGGAAAGA GGGAAGCCCT


4241 AGCGTGGCGC TTTCTCAATG CTCACGCTGT AGGTATCTCA GTTCGGTGTA GGTCGTTCGC TCCAAGCTGG GCTGTGTGCA
TCGCACCGCG AAAGAGTTAC GAGTGCGACA TCCATAGAGT CAAGCCACAT CCAGCAAGCG AGGTTCGACC CGACACACGT


4321 CGAACCCCCC GTTCAGCCCG ACCGCTGCGC CTTATCCGGT AACTATCGTC TTGAGTCCAA CCCGGTAAGA CACGACTTAT
GCTTGGGGGG CAAGTCGGGC TGGCGACGCG GAATAGGCCA TTGATAGCAG AACTCAGGTT GGGCCATTCT GTGCTGAATA


4401 CGCCACTGGC AGCAGCCACT GGTAACAGGA TTAGCAGAGC GAGGTATGTA GGCGGTGCTA CAGAGTTCTT GAAGTGGTGG
GCGGTGACCG TCGTCGGTGA CCATTGTCCT AATCGTCTCG CTCCATACAT CCGCCACGAT GTCTCAAGAA CTTCACCACC


4481 CCTAACTACG GCTACACTAG AAGGACAGTA TTTGGTATCT GCGCTCTGCT GAAGCCAGTT ACCTTCGGAA AAAGAGTTGG
GGATTGATGC CGATGTGATC TTCCTGTCAT AAACCATAGA CGCGAGACGA CTTCGGTCAA TGGAAGCCTT TTTCTCAACC


4561 TAGCTCTTGA TCCGGCAAAC AAACCACCGC TGGTAGCGGT GGTTTTTTTG TTTGCAAGCA GCAGATTACG CGCAGAAAAA
ATCGAGAACT AGGCCGTTTG TTTGGTGGCG ACCATCGCCA CCAAAAAAAC AAACGTTCGT CGTCTAATGC GCGTCTTTTT


4641 AAGGATCTCA AGAAGATCCT TTGATCTTTT CTACGGGGTC TGACGCTCAG TGGAACGAAA ACTCACGTTA AGGGATTTTG
TTCCTAGAGT TCTTCTAGGA AACTAGAAAA GATGCCCCAG ACTGCGAGTC ACCTTGCTTT TGAGTGCAAT TCCCTAAAAC


4721 GTCATGAGAT TATCAAAAAG GATCTTCACC TAGATCCTTT TAAATTAAAA ATGAAGTTTT AAATCAATCT AAAGTATATA
CAGTACTCTA ATAGTTTTTC CTAGAAGTGG ATCTAGGAAA ATTTAATTTT TACTTCAAAA TTTAGTTAGA TTTCATATAT


4801 TGAGTAAACT TGGTCTGACA GTTACCAATG CTTAATCAGT GAGGCACCTA TCTCAGCGAT CTGTCTATTT CGTTCATCCA
ACTCATTTGA ACCAGACTGT CAATGGTTAC GAATTAGTCA CTCCGTGGAT AGAGTCGCTA GACAGATAAA GCAAGTAGGT


4881 TAGTTGCCTG ACTCCCCGTC GTGTAGATAA CTACGATACG GGAGGGCTTA CCATCTGGCC CCAGTGCTGC AATGATACCG
ATCAACGGAC TGAGGGGCAG CACATCTATT GATGCTATGC CCTCCCGAAT GGTAGACCGG GGTCACGACG TTACTATGGC


4961 CGAGACCCAC GCTCACCGGC TCCAGATTTA TCAGCAATAA ACCAGCCAGC CGGAAGGGCC GAGCGCAGAA GTGGTCCTGC
GCTCTGGGTG CGAGTGGCCG AGGTCTAAAT AGTCGTTATT TGGTCGGTCG GCCTTCCCGG CTCGCGTCTT CACCAGGACG


5041 AACTTTATCC GCCTCCATCC AGTCTATTAA TTGTTGCCGG GAAGCTAGAG TAAGTAGTTC GCCAGTTAAT AGTTTGCGCA
TTGAAATAGG CGGAGGTAGG TCAGATAATT AACAACGGCC CTTCGATCTC ATTCATCAAG CGGTCAATTA TCAAACGCGT


5121 ACGTTGTTGC CATTGCTACA GGCATCGTGG TGTCACGCTC GTCGTTTGGT ATGGCTTCAT TCAGCTCCGG TTCCCAACGA
TGCAACAACG GTAACGATGT CCGTAGCACC ACAGTGCGAG CAGCAAACCA TACCGAAGTA AGTCGAGGCC AAGGGTTGCT


5201 TCAAGGCGAG TTACATGATC CCCCATGTTG TGCAAAAAAG CGGTTAGCTC CTTCGGTCCT CCGATCGTTG TCAGAAGTAA
AGTTCCGCTC AATGTACTAG GGGGTACAAC ACGTTTTTTC GCCAATCGAG GAAGCCAGGA GGCTAGCAAC AGTCTTCATT


5281 GTTGGCCGCA GTGTTATCAC TCATGGTTAT GGCAGCACTG CATAATTCTC TTACTGTCAT GCCATCCGTA AGATGCTTTT
CAACCGGCGT CACAATAGTG AGTACCAATA CCGTCGTGAC GTATTAAGAG AATGACAGTA CGGTAGGCAT TCTACGAAAA
5361 CTGTGACTGG
GACACTGACC 



5441 CGGGATAATA
GCCCTATTAT 

5521 GATCTTACCG 
CTAGAATGGC 
CGCAAAGACC
5761 GAAAAATAAA
GTAATTGGAT



CTGTTGAGAT CCAGTTCGAT GTAACCCACT CGTGCACCCA ACTGATCTTC AGCATCTTTT ACTTTCACCA
GACAACTCTA GGTCAAGCTA CATTGGGTGA GCACGTGGGT TGACTAGAAG TCGTAGAAAA TGAAAGTGGT



ssse ss ss jess ss gg 

SSSBg 5S Sg gSSSS S 

= SSSS SS ESSE S3SS3 SESS 



TAAAAATAGG CGTATCACGA GGCCCTTTCG TC 
ATTTTTATCC GCATAGTGCT CCGGGAAAGC AG 